Journal: bioRxiv
Article Title: Single-Cell Data Integration and Cell Type Annotation through Contrastive Adversarial Open-set Domain Adaptation
doi: 10.1101/2024.10.04.616599
Figure Legend Snippet: A) Comparison of SAFAARI’s performance with the selected reference-based cell-type annotation models in both open-set and closed-set settings. scRNA-seq data from eight tissues in the Tabula Muris cell atlas were obtained, with gene counts derived using two techniques: 10x Genomics and FACS-based cell capture in plates (FACS). For the performance assessment, either FACS or 10x was treated as the source dataset and the other as the target dataset, to evaluate reference-based cell-type annotation (label transfer) in the presence of a technology-based domain shift or batch effect. Two scenarios were considered: the closed-set, where only cell types common to both source and target datasets were included, and the open-set, where the target dataset contained an unknown cell type not present in the source dataset. B) Heatmap representing the confusion matrix across eight tissues (target: FACS), showing cell-type-specific annotation performance. Columns represent the actual cell labels, while rows show the predicted cell labels. The cell type coloured in navy blue is the unknown cell type whose instances were removed from the source dataset. Colours follow the viridis palette and indicate the proportion of cells relative to the column sum (i.e., the values within each column add up to 1.0). Each column therefore shows the proportion of correct classifications (diagonal values) and misclassifications for the cell type named in the column header. C) UMAP of SAFAARI’s open-set label transfer results on four human pancreas datasets generated with different technologies, including microfluidic (Fluidigm C), droplet-based (inDrops) and plate-based scRNA-seq (CEL-seq2, Smart-seq2) as detailed in . The UMAP demonstrates SAFAARI’s superior batch mixing, cell separation and unknown cell type detection.
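Illustrative note on panel B: the column normalisation described in the legend can be sketched as below. This is a minimal sketch, not the authors' code. The function name `column_normalised_confusion`, the label names and the toy cell assignments are all hypothetical and are used only to show how rows (predicted labels) and columns (actual labels) relate when each column is scaled to sum to 1.0.

```python
import numpy as np

def column_normalised_confusion(actual, predicted, labels):
    """Return a matrix with rows = predicted labels and columns = actual labels.

    Each column is divided by its own sum, so a column lists the fractions of
    one actual cell type that were assigned to each predicted label.
    (Hypothetical helper for illustration; not part of SAFAARI.)
    """
    index = {label: i for i, label in enumerate(labels)}
    counts = np.zeros((len(labels), len(labels)), dtype=float)
    for a, p in zip(actual, predicted):
        counts[index[p], index[a]] += 1  # row: predicted, column: actual
    col_sums = counts.sum(axis=0, keepdims=True)
    # Leave columns with no actual cells at zero rather than dividing by zero.
    return np.divide(counts, col_sums, out=np.zeros_like(counts), where=col_sums > 0)

# Toy data (made up): "Unknown" stands in for a cell type absent from the source.
labels = ["B cell", "T cell", "Unknown"]
actual = ["B cell", "B cell", "T cell", "T cell", "Unknown", "Unknown"]
predicted = ["B cell", "T cell", "T cell", "T cell", "Unknown", "B cell"]

matrix = column_normalised_confusion(actual, predicted, labels)
print(matrix)
```

In this layout the diagonal entry of a column is the fraction of that cell type annotated correctly, and the off-diagonal entries in the same column are its misclassification fractions.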
Article Snippet: These methods range from microfluidic droplet-based platforms (such as 10x Genomics Chromium, Drop-seq, and inDrops) to plate-based scRNA-seq technologies like Smart-seq, Smart-seq2, and Smart-seq3, resulting in substantial heterogeneity across datasets.
Techniques: Comparison, Derivative Assay, Generated